部署的AI系统通常不起作用。它们可以随意地构造,不加选择地部署并欺骗性地促进。然而,尽管有这一现实,但学者,新闻界和决策者对功能的关注很少。这导致技术和政策解决方案的重点是“道德”或价值一致的部署,通常会跳过先前的问题,即给定系统功能或完全提供任何好处。描述各种功能失败的危害,我们分析一组案例研究,以创建已知的AI功能问题的分类法。然后,我们指出的是政策和组织响应,这些策略和组织响应经常被忽略,并在功能成为重点后变得更容易获得。我们认为功能是一项有意义的AI政策挑战,是保护受影响社区免受算法伤害的必要第一步。
translated by 谷歌翻译
The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about the common practice as well as bottlenecks faced by the community in tackling the research questions posed. To shed light on the status quo of algorithm development in the specific field of biomedical imaging analysis, we designed an international survey that was issued to all participants of challenges conducted in conjunction with the IEEE ISBI 2021 and MICCAI 2021 conferences (80 competitions in total). The survey covered participants' expertise and working environments, their chosen strategies, as well as algorithm characteristics. A median of 72% challenge participants took part in the survey. According to our results, knowledge exchange was the primary incentive (70%) for participation, while the reception of prize money played only a minor role (16%). While a median of 80 working hours was spent on method development, a large portion of participants stated that they did not have enough time for method development (32%). 25% perceived the infrastructure to be a bottleneck. Overall, 94% of all solutions were deep learning-based. Of these, 84% were based on standard architectures. 43% of the respondents reported that the data samples (e.g., images) were too large to be processed at once. This was most commonly addressed by patch-based training (69%), downsampling (37%), and solving 3D analysis tasks as a series of 2D tasks. K-fold cross-validation on the training set was performed by only 37% of the participants and only 50% of the participants performed ensembling based on multiple identical models (61%) or heterogeneous models (39%). 48% of the respondents applied postprocessing steps.
translated by 谷歌翻译
Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption, most LLMs are developed by resource-rich organizations and are frequently kept from the public. As a step towards democratizing this powerful technology, we present BLOOM, a 176B-parameter open-access language model designed and built thanks to a collaboration of hundreds of researchers. BLOOM is a decoder-only Transformer language model that was trained on the ROOTS corpus, a dataset comprising hundreds of sources in 46 natural and 13 programming languages (59 in total). We find that BLOOM achieves competitive performance on a wide variety of benchmarks, with stronger results after undergoing multitask prompted finetuning. To facilitate future research and applications using LLMs, we publicly release our models and code under the Responsible AI License.
translated by 谷歌翻译
对比性自我监督学习方法学会将图像(例如图像)映射到无需标签的情况下将图像映射到非参数表示空间中。尽管非常成功,但当前方法在训练阶段需要大量数据。在目标训练集规模限制的情况下,已知概括是差的。在大型源数据集和目标样本上进行微调进行预处理,容易在几杆方向上过度拟合,在几个弹药方面,只有少量的目标样本可用。在此激励的情况下,我们提出了一种用于自我监督的对比度学习的域适应方法,称为少数最大的学习方法,以解决对目标分布的适应问题,这些问题在几乎没有射击学习下。为了量化表示质量,我们在包括ImageNet,Visda和FastMRI在内的一系列源和目标数据集上评估了很少的最大最大速度,在这些数据集和FastMRI上,很少有最大最大的最大值始终优于其他方法。
translated by 谷歌翻译
语言模型既展示了定量的改进,又展示了新的定性功能,随着规模的增加。尽管它们具有潜在的变革性影响,但这些新能力的特征却很差。为了为未来的研究提供信息,为破坏性的新模型能力做准备,并改善社会有害的效果,至关重要的是,我们必须了解目前和近乎未来的能力和语言模型的局限性。为了应对这一挑战,我们介绍了超越模仿游戏基准(Big Bench)。 Big Bench目前由204个任务组成,由132家机构的442位作者贡献。任务主题是多样的,从语言学,儿童发展,数学,常识性推理,生物学,物理学,社会偏见,软件开发等等。 Big-Bench专注于被认为超出当前语言模型的功能的任务。我们评估了OpenAI的GPT型号,Google内部密集变压器体系结构和大型基础上的开关稀疏变压器的行为,跨越了数百万到数十亿个参数。此外,一个人类专家评估者团队执行了所有任务,以提供强大的基准。研究结果包括:模型性能和校准都随规模改善,但绝对的术语(以及与评估者的性能相比);在模型类中的性能非常相似,尽管带有稀疏性。逐渐和预测的任务通常涉及大量知识或记忆成分,而在临界规模上表现出“突破性”行为的任务通常涉及多个步骤或组成部分或脆性指标;社交偏见通常会随着含糊不清的环境而随着规模而增加,但这可以通过提示来改善。
translated by 谷歌翻译
通用形态(UNIMORPH)项目是一项合作的努力,可为数百种世界语言实例化覆盖范围的标准化形态拐角。该项目包括两个主要的推力:一种无独立的特征架构,用于丰富的形态注释,并以各种语言意识到该模式的各种语言的带注释数据的类型级别资源。本文介绍了过去几年对几个方面的扩张和改进(自McCarthy等人(2020年)以来)。众多语言学家的合作努力增加了67种新语言,其中包括30种濒危语言。我们已经对提取管道进行了一些改进,以解决一些问题,例如缺少性别和马克龙信息。我们还修改了模式,使用了形态学现象所需的层次结构,例如多肢体协议和案例堆叠,同时添加了一些缺失的形态特征,以使模式更具包容性。鉴于上一个UniMorph版本,我们还通过16种语言的词素分割增强了数据库。最后,这个新版本通过通过代表来自metphynet的派生过程的实例丰富数据和注释模式来推动将衍生物形态纳入UniMorph中。
translated by 谷歌翻译
超越地球轨道的人类空间勘探将涉及大量距离和持续时间的任务。为了有效减轻无数空间健康危害,数据和空间健康系统的范式转移是实现地球独立性的,而不是Earth-Reliance所必需的。有希望在生物学和健康的人工智能和机器学习领域的发展可以解决这些需求。我们提出了一个适当的自主和智能精密空间健康系统,可以监控,汇总和评估生物医学状态;分析和预测个性化不良健康结果;适应并响应新累积的数据;并提供对其船员医务人员的个人深度空间机组人员和迭代决策支持的预防性,可操作和及时的见解。在这里,我们介绍了美国国家航空航天局组织的研讨会的建议摘要,以便在太空生物学和健康中未来的人工智能应用。在未来十年,生物监测技术,生物标志科学,航天器硬件,智能软件和简化的数据管理必须成熟,并编织成精确的空间健康系统,以使人类在深空中茁壮成长。
translated by 谷歌翻译
空间生物学研究旨在了解太空飞行对生物的根本影响,制定支持深度空间探索的基础知识,最终生物工程航天器和栖息地稳定植物,农作物,微生物,动物和人类的生态系统,为持续的多行星寿命稳定。要提高这些目标,该领域利用了来自星空和地下模拟研究的实验,平台,数据和模型生物。由于研究扩展到低地球轨道之外,实验和平台必须是最大自主,光,敏捷和智能化,以加快知识发现。在这里,我们介绍了由美国国家航空航天局的人工智能,机器学习和建模应用程序组织的研讨会的建议摘要,这些应用程序为这些空间生物学挑战提供了关键解决方案。在未来十年中,将人工智能融入太空生物学领域将深化天空效应的生物学理解,促进预测性建模和分析,支持最大自主和可重复的实验,并有效地管理星载数据和元数据,所有目标使生活能够在深空中茁壮成长。
translated by 谷歌翻译
我们提出了两种新颖的编码联合学习(FL)方案,用于减轻乐曲设备的效果。第一种方案,CodedPaddedFL,减轻了乐谱装置的效果,同时保留了传统的隐私水平。特别地,它将一次性填充与梯度码相结合,以产生对讨论设备的弹性。要将一次性填充应用于真实数据,我们的计划利用数据的定点算术表示。对于具有25个设备的场景,CodedPaddedFL与传统FL相比,CodedPaddedFL分别在MM师和时尚-MNIST数据集中获得6.6和9.2的速度增速因子为6.6和9.2。此外,与Prakash \ Emph {等人}最近提出的方案相比,它在延迟方面产生了类似的性能。没有额外的私人数据泄漏的缺点。第二个方案CodedSecagg提供落后和防止模型反转攻击的稳健性,并基于Shamir的秘密共享。 CodedSecagg优先于最先进的安全聚合方案,如6.6-14.6的加速因子,这取决于拼写设备的数量,在具有120个设备的场景的MNIST数据集上,以牺牲与CodedPaddedFL相比,延迟增加了30 \%。
translated by 谷歌翻译
Quadruped robots are currently used in industrial robotics as mechanical aid to automate several routine tasks. However, presently, the usage of such a robot in a domestic setting is still very much a part of the research. This paper discusses the understanding and virtual simulation of such a robot capable of detecting and understanding human emotions, generating its gait, and responding via sounds and expression on a screen. To this end, we use a combination of reinforcement learning and software engineering concepts to simulate a quadruped robot that can understand emotions, navigate through various terrains and detect sound sources, and respond to emotions using audio-visual feedback. This paper aims to establish the framework of simulating a quadruped robot that is emotionally intelligent and can primarily respond to audio-visual stimuli using motor or audio response. The emotion detection from the speech was not as performant as ERANNs or Zeta Policy learning, still managing an accuracy of 63.5%. The video emotion detection system produced results that are almost at par with the state of the art, with an accuracy of 99.66%. Due to its "on-policy" learning process, the PPO algorithm was extremely rapid to learn, allowing the simulated dog to demonstrate a remarkably seamless gait across the different cadences and variations. This enabled the quadruped robot to respond to generated stimuli, allowing us to conclude that it functions as predicted and satisfies the aim of this work.
translated by 谷歌翻译